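This analysis uses the Google Play Store apps data set published on Kaggle. Invalid or erroneous records were removed beforehand, and the analysis focuses mainly on the distribution of app ratings.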
library(ggplot2)
library(plotly)
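# Data from https://www.kaggle.com/lava18/google-play-store-apps
# stringsAsFactors = TRUE keeps text columns as factors (no longer the default since R 4.0),
# which is what the level counts in the summary() output rely on.
mydata <- read.csv("googleplaystore.csv", stringsAsFactors = TRUE)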
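The cleaning step mentioned in the introduction is not shown in this report. As a minimal sketch of what such a filter might look like, the example below uses a hypothetical toy data frame `raw` (not the real file) and assumes that "invalid" means a missing rating or a rating outside the 1 to 5 range. The author's actual criteria may differ.

```r
# Hypothetical toy stand-in for an uncleaned file; the real CSV has more columns.
raw <- data.frame(
  App = c("A", "B", "C", "D"),
  Category = c("GAME", "TOOLS", "FAMILY", "1.9"),
  Rating = c(4.5, NA, 3.8, 19),
  stringsAsFactors = TRUE
)

# Assumed rule: keep rows whose rating is present and lies in the valid 1-5 range.
valid <- !is.na(raw$Rating) & raw$Rating >= 1 & raw$Rating <= 5

# droplevels() removes factor levels that no longer occur after filtering,
# so they do not show up as empty groups in later tables and plots.
cleaned <- droplevels(raw[valid, ])

nrow(cleaned)
levels(cleaned$Category)
```

Dropping unused levels matters here because the later plots group by `Category`; a leftover level from a discarded row would otherwise appear as an empty group.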
Before the analysis, here is an overview of the data used.
- The data set contains 9,244 apps in 33 categories. FAMILY is the largest category, followed by GAME and then TOOLS.
summary(mydata)
##                                                  App
##  ROBLOX                                            :   9
##  CBS Sports App - Scores, News, Stats & Watch Live :   8
##  Candy Crush Saga                                  :   7
##  Duolingo: Learn Languages Free                    :   7
##  ESPN                                              :   7
##  Bleacher Report: sports news, scores, & highlights:   6
##  (Other)                                           :9200
##
##           Category        Rating         Reviews
##  FAMILY       :1725   Min.   :1.000   Min.   :       1
##  GAME         :1073   1st Qu.:4.000   1st Qu.:     182
##  TOOLS        : 727   Median :4.300   Median :    5718
##  PRODUCTIVITY : 350   Mean   :4.191   Mean   :  501118
##  MEDICAL      : 343   3rd Qu.:4.500   3rd Qu.:   79312
##  COMMUNICATION: 326   Max.   :5.000   Max.   :78158306
##  (Other)      :4700
##
##                  Size           Installs      Type          Price
##  Varies with device:1607   1000000+ :1557   Free:8604   0      :8604
##  12M               : 161   10000000+:1236   Paid: 640   $2.99  : 112
##  14M               : 160   100000+  :1142               $0.99  : 107
##  11M               : 159   10000+   : 999               $4.99  :  69
##  13M               : 157   5000000+ : 733               $1.99  :  58
##  15M               : 155   1000+    : 708               $3.99  :  58
##  (Other)           :6845   (Other)  :2869               (Other): 236
##
##          Content.Rating           Genres         Last.Updated
##  Adults only 18+:   3   Tools        : 726   3-Aug-18 : 307
##  Everyone       :7339   Entertainment: 526   2-Aug-18 : 274
##  Everyone 10+   : 386   Education    : 468   1-Aug-18 : 269
##  Mature 17+     : 446   Productivity : 350   31-Jul-18: 269
##  Teen           :1069   Action       : 349   30-Jul-18: 197
##  Unrated        :   1   Medical      : 343   25-Jul-18: 156
##                         (Other)      :6482   (Other)  :7772
##
##              Current.Ver              Android.Ver
##  Varies with device:1390   4.1 and up        :2029
##  1                 : 473   Varies with device:1294
##  1.1               : 206   4.0.3 and up      :1222
##  1.2               : 132   4.0 and up        :1118
##  2                 : 128   4.4 and up        : 862
##  1.3               : 118   2.3 and up        : 578
##  (Other)           :6797   (Other)           :2141
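The headline figures quoted before the summary (number of apps, number of categories, and the largest categories) can also be computed directly rather than read off the `summary()` output. The sketch below uses a small hypothetical data frame `apps` so that it runs on its own; the variable names are illustrative only.

```r
# Hypothetical toy data; with the real data, use mydata in place of apps.
apps <- data.frame(
  Category = factor(c("FAMILY", "GAME", "FAMILY", "TOOLS", "GAME", "FAMILY"))
)

n_apps <- nrow(apps)                    # number of apps (rows)
n_categories <- nlevels(apps$Category)  # number of distinct categories

# Count apps per category and sort from largest to smallest.
top_categories <- sort(table(apps$Category), decreasing = TRUE)

n_apps
n_categories
head(top_categories, 3)
```

`nlevels()` assumes `Category` is a factor; for a character column, `length(unique(apps$Category))` gives the same count.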
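# Number of apps in each category (interactive bar chart)
plot_ly(mydata, x = ~Category, color = ~Category, type = "histogram")
# Distribution of ratings across all apps, in bins of 0.1
my.plot3 <- ggplot(mydata, aes(x = Rating)) +
  geom_histogram(binwidth = 0.1, fill = "steelblue")
my.plot3
# Rating distribution within each category (one horizontal box per category)
plot_ly(mydata, x = ~Rating, color = ~Category, type = "box")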
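With 33 categories, the box plot can be hard to read at a glance. As a possible numerical companion, the sketch below computes the median rating per category and sorts the result. It runs on a hypothetical toy data frame `ratings`; the values are made up for illustration and are not results from the real data.

```r
# Hypothetical toy data; with the real data, use mydata in place of ratings.
ratings <- data.frame(
  Category = c("GAME", "GAME", "TOOLS", "TOOLS", "TOOLS"),
  Rating = c(4.0, 4.4, 3.5, 4.1, 4.3)
)

# Median rating for each category (the centre line of each box in a box plot).
med <- aggregate(Rating ~ Category, data = ratings, FUN = median)

# Sort categories from highest to lowest median rating.
med <- med[order(med$Rating, decreasing = TRUE), ]
med
```

The median is used here because it is the statistic the box plot itself displays, which keeps the table and the figure directly comparable.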